Weighted Graph Clustering with Non-Uniform Uncertainties
Authors
Abstract
We study the graph clustering problem where each observation (edge or no-edge between a pair of nodes) may have a different level of confidence/uncertainty. We propose a clustering algorithm based on optimizing an appropriate weighted objective, in which larger weights are given to observations with lower uncertainty. Our approach leads to a convex optimization problem that is efficiently solvable. We analyze our approach under a natural generative model and establish theoretical guarantees for recovering the underlying clusters. Our main result is a general theorem that applies to any given weight and distribution for the uncertainty. By optimizing over the weights, we derive a provably optimal weighting scheme, which matches the information-theoretic lower bound up to logarithmic factors and leads to strong performance bounds in several specific settings. By optimizing over the uncertainty distribution, we show that non-uniform uncertainties can actually help. In particular, if the graph is built by spending a limited amount of resource to take measurements on each node pair, then it is beneficial to allocate the resource in a non-uniform fashion to obtain accurate measurements on a few pairs of nodes, rather than obtaining inaccurate measurements on many pairs. We provide simulation results that validate our theoretical findings.
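The weighting idea in the abstract can be illustrated with a small sketch. The abstract only says that observations with lower uncertainty receive larger weights; the log-odds weight `log((1 - p) / p)` used below is an assumed choice for illustration, and the names `log_odds_weights` and `weighted_score` are hypothetical. The sketch does not solve the convex program itself; it only evaluates a weighted agreement score for candidate clusterings on a toy graph.

```python
import numpy as np


def log_odds_weights(error_prob):
    """Assumed weighting: log((1 - p) / p), larger when the error probability p is smaller."""
    p = np.clip(error_prob, 1e-12, 1.0 - 1e-12)  # avoid log(0) and division by zero
    return np.log((1.0 - p) / p)


def weighted_score(observed, error_prob, labels):
    """Weighted agreement between a candidate clustering and the observed graph."""
    labels = np.asarray(labels)
    signs = 2.0 * observed - 1.0                      # +1 for an edge, -1 for no edge
    same = (labels[:, None] == labels[None, :]).astype(float)
    agree = signs * (2.0 * same - 1.0)                # +1 where the candidate matches the observation
    upper = np.triu_indices(len(labels), k=1)         # count each node pair once
    weights = log_odds_weights(error_prob)
    return float(np.sum(weights[upper] * agree[upper]))


# Toy graph: true clusters {0, 1} and {2, 3}; the edge (1, 2) is a noisy, wrong observation.
observed = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Most pairs are measured accurately (p = 0.05); the pair (1, 2) is very uncertain (p = 0.45).
error_prob = np.full((4, 4), 0.05)
error_prob[1, 2] = error_prob[2, 1] = 0.45

score_true = weighted_score(observed, error_prob, [0, 0, 1, 1])
score_wrong = weighted_score(observed, error_prob, [0, 0, 0, 1])
```

In this toy case the unreliable edge contributes almost nothing to the score, so the true clustering scores higher than the candidate that merges node 2 into the first cluster.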
Similar resources
Low-rank Matrix Recovery from Local Coherence Perspective
We investigate the robust PCA problem of decomposing an observed matrix into the sum of a low-rank matrix and a sparse error matrix via convex programming, Principal Component Pursuit (PCP). In contrast to previous studies that assume the support of the sparse error matrix is generated by uniform Bernoulli sampling, we allow non-uniform sampling, i.e., entries of the low-rank matrix are ...
Weighted Graph Clustering with Non-Uniform Uncertainties
... ij b0 almost surely for all $i, j$ and the condition (6) holds, then with high probability, we have $\|W - \mathbb{E}[W]\| \le c_2 \left( b \log n + \sqrt{\rho n \log n} \right)$ (12) and $\left\| UU^\top (W - \mathbb{E}[W]) \right\|_\infty \le c_3 \frac{\sqrt{b^2 \log n + \rho K \log n}}{K}$ (13) for some universal constants $c_2, c_3$. We prove the lemma in Section 7.1.1 to follow. We now prove Theorem 1 assuming the two inequalities (12) and (13) in the lemma hold. For any matrix $Y$, we define...
Numerical Solution of Seismic Wave Propagation Equation in Uniform Soil on Bed Rock with Weighted Residual Method
To evaluate the seismic response of the earth to earthquake effects, ground response analyses are used to predict ground surface motions for the development of design response spectra, to compute dynamic stresses and strains for the evaluation of liquefaction hazards, and to determine the earthquake-induced forces that can lead to instability of earth and earth-retaining structures. Most of the analytical...
A Unified View of Kernel k-means, Spectral Clustering and Graph Cuts
Recently, a variety of clustering algorithms have been proposed to handle data that is not linearly separable. Spectral clustering and kernel k-means are two such methods that are seemingly quite different. In this paper, we show that a general weighted kernel k-means objective is mathematically equivalent to a weighted graph partitioning objective. Special cases of this graph partitioning ob...
A partition-based algorithm for clustering large-scale software systems
Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...
Publication date: 2014